Abstract 



Correlations of SAT I: Reasoning Test scores (SAT I) and high school grade point 
average (HSGPA) with freshman grade point average (FGPA) were studied in a sample of 23 
colleges. The SAT I predicts FGPA about equally well across different ethnic groups. 
Correlations of the SAT I and the composite of SAT I scores and high school grade point 
average with FGPA were generally higher for women than for men, although this pattern was 
reversed at the most highly selective colleges. Adjusting for differences in course grading 
policies increased correlations by about .05. When a single prediction equation was used for all 
students, men tended to get lower grades than predicted and women got higher grades than 
predicted. Adjustments for course difficulty reduced underprediction, and there was no 
underprediction for women who intended to major in mathematics or scientific fields. African 
American and Hispanic/Latino men received lower grades than predicted, but women 
in these groups performed as predicted by the composite. 

Colleges use SAT I: Reasoning Test (SAT I) scores as a supplement to other information, 
notably the high school grade point average (HSGPA), to make selection decisions. We 
examined the utility of SAT I and HSGPA both individually and combined as predictors of 
college grades. We recognize that freshman grades are only one indicator of success in college 
and that much can be gained from considering a broader perspective (Willingham, 1985); 
nevertheless, the freshman GPA (FGPA) is an important indicator because it reflects a 
cumulative judgment of the quality of college-level academic performance made by a number of 
faculty members in several different disciplines. Although the four-year average might be a 
preferable criterion, research reviews suggest that there is little or no difference in the size of 
validity coefficients based on FGPA and those based on the cumulative four-year average 
(Wilson, 1983; Burton & Ramist, in press). 

Because students select colleges and colleges select students, the range of SAT scores 
and HSGPAs found among the enrolled students at a particular college can be much narrower 
than the range found in the potential applicant population. This restriction in range tends to 
reduce correlations with FGPA, which can be computed only for enrolled students; the real question 
of interest is how well the scores predict for potential applicants, not for enrolled students. 
Therefore, correlations were adjusted to estimate what they would have been if the range of SAT 
I scores and HSGPAs was the same for a given college as for the full national cohort of college- 
bound seniors taking the SAT I. 

An additional question of interest was the extent to which the SAT I yielded over- or 
underpredictions, that is, whether predictions based on the total group were either too high or too 
low for specific subgroups. Overprediction occurs when a subgroup does not perform as well as 
predicted; their predicted performance is above, or over, their actual performance. A common 
finding is that college grades of women are underpredicted and grades of ethnic minorities are 
overpredicted (Breland, 1979; Linn, 1978; Ramist, Lewis, & McCamley-Jenkins, 1994; Sawyer, 
1986). These studies all examined gender and ethnic groups separately rather than examining 
gender effects within ethnic groups. This leaves open the question of whether FGPA for women 
from minority groups is over- or underpredicted. A study of African American and 
Hispanic/Latino women in three colleges suggested that their scores were slightly underpredicted 
by SAT scores when the predictions were based on males from the same ethnic group (Pennock- 
Roman, 1994). However, Pennock-Roman did not evaluate over/underprediction within 
gender/ethnic groups when the original predictions were based on the regression for all students. 

Method 

Sample 

Data for the 1995 entering class was provided by 23 colleges. The colleges in the sample 
represented a combination of public and private institutions (13 public and 10 private), including 
one junior college. One college had only female students. Each of the six College Board 
geographical regions was represented, but the sample should not be considered as a nationally 
representative sample in a strict sampling sense. In particular, most of the colleges were well 
above average in selectivity and had relatively high SAT I scores in their freshman classes. 

Seven colleges had average Verbal + Math SAT I scores above 1250, and only two colleges had 
average scores below 1000. In the sample, average scores on the recentered SAT Program scale 
were 566 Verbal and 581 Math compared to 504 and 506 for all college-bound seniors in 1995 
(College Board/Educational Testing Service, 1995). 



Variables 



Colleges were asked to provide the freshman grade point average (FGPA) for all students 
in the 1995 entering classes. In addition, they were asked to provide grades in individual 
courses, but only seven colleges sent this course-level information. SAT I scores were extracted 
from SAT Program files at ETS. Demographic information was obtained from the Student 
Descriptive Questionnaire (SDQ), which about 95% of the students voluntarily complete when 
they register to take tests in the SAT Program. The self-reported HSGPA was also obtained from 
the SDQ. This HSGPA contains 12 categories from F through A+ and was coded 
such that F = 0, D- = .7, D = 1.0, D+ = 1.3, . . . , A+ = 4.3. FGPA was similarly coded from 0 to 
4.3, though 4.0 was the top score for many colleges that did not use A+ grades. Previous 
research suggests that using the self-reported HSGPA from the SDQ results in multiple 
correlations (combining SAT scores and HSGPA to predict FGPA) that are about .03 to .04 
points smaller than multiple correlations that use the actual school-reported HSGPA (Freeberg, 
Rock, & Pollack, 1989). 

Procedures 

Correlations of predictors with FGPA were corrected for range restriction with the 
Pearson-Lawley multivariate correction (Gulliksen, 1950, pp. 165-166). This adjustment 
requires the national standard deviations for the predictors as well as their intercorrelations. For 
the old (unrecentered) SAT Program scale, these SDs were as follows: SAT I-V, 112; SAT I-M, 
124; HSGPA, 0.66. For the recentered scale the SDs were: SAT I-V, 110; SAT I-M, 111. 
Correlations were the same for old and recentered scales and were as follows: SAT I-V with 
SAT I-M, .71; SAT I-V with HSGPA, .48; SAT I-M with HSGPA, .53. 
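
For readers who want the algebra, the multivariate correction can be written compactly. What follows is a sketch of the standard Pearson-Lawley result, not a transcription of the authors' computations. The symbols are assumptions introduced here: x denotes the three explicit selection variables (SAT I-V, SAT I-M, HSGPA), y denotes FGPA, V denotes within-college (restricted) variances and covariances, and Sigma denotes the national (unrestricted) values built from the SDs and intercorrelations listed above.

```latex
\begin{aligned}
\Sigma_{xy} &= \Sigma_{xx}\, V_{xx}^{-1}\, V_{xy} \\
\sigma_{yy} &= v_{yy} - V_{yx} V_{xx}^{-1} V_{xy}
              + V_{yx} V_{xx}^{-1}\, \Sigma_{xx}\, V_{xx}^{-1} V_{xy} \\
r^{*}_{x_i y} &= \frac{(\Sigma_{xy})_i}{\sqrt{(\Sigma_{xx})_{ii}\; \sigma_{yy}}}
\end{aligned}
```

With a single predictor this reduces to the familiar univariate correction, r* = r(S/s) / sqrt(1 - r^2 + r^2 S^2/s^2), where S and s are the unrestricted and restricted SDs of the predictor.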

All correlations with FGPA were computed within colleges, weighted by the number of 
students at that college, and averaged across colleges. Similarly, multiple correlations that used 
more than one predictor were computed within college and then the weighted average taken 
across colleges. If any predictor in the multiple correlations had a negative weight, the multiple 
correlation was recomputed with that variable removed. 
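
The size-weighted averaging described above can be sketched as follows. This is a minimal illustration, not the authors' code; the function name and the example values are hypothetical and are not study data.

```python
def weighted_mean_correlation(correlations, sizes):
    """Average within-college correlations, weighting each by its student count."""
    if len(correlations) != len(sizes) or not correlations:
        raise ValueError("need one student count per correlation")
    total = sum(sizes)
    return sum(r * n for r, n in zip(correlations, sizes)) / total


# Hypothetical within-college correlations and enrollments (not study data).
example_r = [0.40, 0.50, 0.60]
example_n = [1000, 2000, 1000]
pooled = weighted_mean_correlation(example_r, example_n)
```

Because the weights are student counts, a large college moves the average more than a small one with the same correlation.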

Over/underprediction was analyzed by making predictions based on all students in a 
college and then, for each gender within ethnic subgroup, computing the difference between the 
predicted and actual FGPA (predicted GPA minus actual GPA). The result is in grade point 
units, with positive values indicating overprediction and negative values indicating 
underprediction. Two colleges were excluded from the averages — one had only female students 
and the other used a 0-15 scale for FGPA rather than the 0-4 (or 0-4.3) scale used at the other 
colleges. 
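
The over/underprediction calculation can be illustrated with a short sketch. It simplifies the procedure above by using a single composite predictor rather than separate V, M, and H weights. The function names and data are hypothetical; only the sign convention (predicted minus actual, positive meaning overprediction) is taken from the text.

```python
from collections import defaultdict


def fit_line(x, y):
    """Ordinary least squares for one predictor; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def over_underprediction(composite, fgpa, group):
    """Mean (predicted - actual) FGPA per subgroup, using one equation for all students."""
    intercept, slope = fit_line(composite, fgpa)
    diffs = defaultdict(list)
    for xi, yi, g in zip(composite, fgpa, group):
        diffs[g].append(intercept + slope * xi - yi)
    return {g: sum(v) / len(v) for g, v in diffs.items()}


# Hypothetical data: at each composite value one group earns 0.1 above the
# common line and the other 0.1 below it.
x = [1, 2, 3, 4, 1, 2, 3, 4]
y = [2 + 0.3 * xi + 0.1 for xi in x[:4]] + [2 + 0.3 * xi - 0.1 for xi in x[4:]]
g = ["women"] * 4 + ["men"] * 4
result = over_underprediction(x, y, g)
```

In this constructed example the group earning more than the common equation predicts gets a negative value (underprediction) and the other group a positive value (overprediction).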

Results and Discussion 

Correlations for Colleges in Three Selectivity Ranges 

Correlations for colleges in three selectivity ranges are shown in Table 1. These 
correlations are adjusted for restriction in range. Such adjustments are especially useful when 
comparisons are being made across categories in which there is more range restriction in one 
category than another; specifically, there is greater restriction in the most highly-selective 
category. 

Consistent with previous findings (Ramist, Lewis, and McCamley-Jenkins, 1994), 
correlations tended to be higher for the most selective institutions. In addition, the SAT I 
increment, that is, the extent to which SAT I scores improve predictions over HSGPA alone, 
tended to be greatest for the most selective colleges. The SAT I increment was .04 in the lowest 
category, .06 in the middle category, and .09 in the highest category. 

The pattern of higher correlations in the most selective colleges was not replicated in all 
gender/ethnic groups. Note, for example, that the V + M correlation for the most selective 
colleges was no higher than for the least selective colleges for African American, Asian 
American, and Hispanic/Latino females and for African American males as well. In the 
relatively large White sample, the correlation in the most selective colleges was much higher for 
males (.23 and .20 higher compared to the low and middle groups respectively) but only 
marginally higher for females (.01 and .05 respectively). As indicated in Figure 1, correlations 
for the combined ethnic groups suggest that the SAT I (V + M) is a better predictor for women 
than for men at the less selective colleges, but that it predicts FGPA equally well for men and 
women at the most selective colleges. These data are consistent with the argument that 
behaviors unrelated to the developed abilities measured by the SAT, such as failing to attend 
class or complete assignments on time, may be more common in males and therefore make male 
grades more difficult to predict (Stricker, Rock, & Burton, 1991). Because males at the most 
highly selective colleges may be as likely as females to attend class and complete assignments, 
tested abilities should be equally valid for men and women at these highly selective institutions. 

For the V + M + H composite, the same pattern seen for SAT I scores alone was 
repeated — grades of females were predicted more accurately in the less selective colleges, but 
grades of males were predicted more accurately at the most selective colleges. This pattern was 
especially evident in the White sample. In the least selective colleges, the correlation was higher 
for females by .08, but in the most selective colleges, the correlation was .05 higher for males. 
The previous Ramist et al. (1994) study found a similar pattern with higher correlations for 
women at the less selective schools in their sample, and a very small male advantage at the most 
selective schools; they did not provide information on gender within ethnic within selectivity 
categories. The same pattern could also be observed in the colleges studied by Pennock-Roman 
(1994). The two colleges in her sample that would be classified in the most selective group in 
our sample both showed higher correlations for White males than for White females. In the two 
less selective colleges, the pattern was reversed. 

Over- and Underpredictions 

Table 2 presents the over- and underpredictions of FGPA. Consistent with previous 
findings (e.g., Ramist et al., 1994; Pennock-Roman, 1994), there was a modest underprediction of 
women’s grades and the complementary overprediction of men’s grades. As indicated in Figure 
2, for the three ethnic minority groups studied, there was virtually no over- or underprediction of 
women’s grades from the combination of SAT I scores and HSGPA. There was moderate to 
substantial overprediction of men’s grades. For all of the three minority groups, but especially 
for the African American and Hispanic/Latino groups, there was overprediction of grades for 
men, i.e., men did not perform as well in college as would be expected from their high school 
grades and SAT scores. Note that in these groups the overprediction was as great for HSGPA by 
itself as for the SAT by itself. In the most highly selective colleges, the underprediction of 
women’s grades from the SAT and HSGPA composite was slightly less, ranging from -.04 to 
-.05. 

Correlations within Parental Education and Income Categories 

Socioeconomic categories, such as the highest educational degree earned by either parent, 
interact with ethnic categories in a way that makes it difficult to attribute results to ethnic as 
opposed to socioeconomic categories. In an attempt to disentangle these effects, we ran 
correlations for the ethnic/gender groups separately in three parental education categories. The 
parent education categories were derived from responses on the Student Descriptive 
Questionnaire. Students responded for both mother’s and father’s education level, and we used 
whichever parent had the higher level. We used three categories: high school diploma or less, 
bachelor's degree, and graduate degree. Students with parents who had some graduate work but 
no graduate degree were included in the bachelor's degree category; students whose parents had 
some college but no degree were not included in the analysis. 

As shown in Figure 3, across ethnic groups college grades tend to be more predictable for 
students whose parents have more education. Within each parental education category, grades 
were most predictable for Asian American males, but within-category trends were less clear for 
the other groups. For example, within the college degree category, V + M + H correlations were 
just as high for African American males as for White males, but correlations for African 
American females appeared to be relatively low. In the high school diploma category, V + M + 
H correlations were as high for African American females as for White females. As indicated in 
Figure 4, analyses run within family income categories revealed the same trends with some 
within-category variation but a tendency for correlations to be highest in the highest income 
category. 

Adjustment for Course Difficulty 

Because grading standards differ across courses, students in leniently-graded courses may 
receive higher grades, on average, than students with the same academic background who take 
strictly-graded courses. For some students, the FGPA may consist primarily of leniently-graded 
courses while for other students the FGPA may consist primarily of strictly-graded courses. 

Given that students with the highest scores on admissions tests often select the scientific and 
quantitative courses that are graded most strictly, the correlation between admission test scores 
and FGPA can be attenuated (Goldman & Widawski, 1976; Elliott & Strenta, 1988; Ramist, 
Lewis, & McCamley-Jenkins, 1994). 

For the current sample, we were able to make adjustments in the seven colleges that 
provided grades in individual courses. A number of different adjustment methods have been 
proposed and evaluated (Stricker, Rock, Burton, Muraki, & Jirele, 1994). We used three 
adjustment methods: a within-course predicted FGPA, a course-grade residual analysis, and an 
analysis within intended college major. 

The within-course predicted FGPA followed the procedure outlined by Ramist, Lewis, 
and McCamley-Jenkins (1994). In this adjustment method, admissions scores are used to make 
linear regression grade predictions in each course containing at least seven freshmen. For each 
student, the predicted grade for each course taken is averaged over all of the courses taken by 
that student to form a predicted FGPA for that student. The predicted FGPA is then correlated 
with the actual FGPA. We performed the within-course predictions separately for each predictor 
(V, M, and H) as well as for the combinations of these predictors (V + M and V + M + H). If 
any equation contained negative regression weights, we removed the variable with the negative 
weight and recomputed the correlation using the remaining predictors. For courses with just a 
few students, the weight for a single predictor could be negative; in these cases we substituted 
the mean grade in the course for the regression estimate. Because regression estimates based on 
optimal weighting of multiple predictors in relatively small samples may inflate correlations by 
capitalizing on chance, we also computed the V + M + H correlation based on uniform weights, 
that is, the simple sum of V + M + (200 x H). We used the same uniform weight equation 
whether the course was predominantly verbal (such as English) or primarily quantitative (such as 
calculus), thus producing a very conservative estimate. 
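
The within-course procedure can be sketched as below, using the uniform-weight composite as the only predictor. This is an illustration under stated assumptions, not the authors' implementation. The function and variable names are hypothetical, the records are invented, and the treatment of courses with fewer than seven freshmen (falling back to the course mean grade) is an assumption, since the text specifies that fallback only for negative weights.

```python
from collections import defaultdict

MIN_FRESHMEN = 7  # minimum course size for a regression, as stated in the text


def uniform_composite(v, m, h):
    """Uniform-weight composite from the text: V + M + (200 x H)."""
    return v + m + 200 * h


def fit_line(x, y):
    """Least-squares (intercept, slope), or None if x has no variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        return None
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predicted_fgpa(records):
    """records: (student, course, composite, grade) tuples.
    Returns {student: mean predicted grade over that student's courses}."""
    by_course = defaultdict(list)
    for student, course, composite, grade in records:
        by_course[course].append((student, composite, grade))

    per_student = defaultdict(list)
    for rows in by_course.values():
        xs = [row[1] for row in rows]
        ys = [row[2] for row in rows]
        intercept, slope = sum(ys) / len(ys), 0.0  # fallback: course mean grade
        if len(rows) >= MIN_FRESHMEN:
            fit = fit_line(xs, ys)
            if fit is not None and fit[1] >= 0:  # keep only non-negative weights
                intercept, slope = fit
        for student, composite, _ in rows:
            per_student[student].append(intercept + slope * composite)
    return {s: sum(p) / len(p) for s, p in per_student.items()}


# Hypothetical records (not study data): one 7-student course, one 2-student course.
records = []
for i in range(7):
    x = uniform_composite(500 + 10 * i, 500 + 10 * i, 3.0)
    records.append((f"s{i}", "MATH", x, 1.0 + 0.001 * x))
records += [("s0", "ART", 1600.0, 4.0), ("s1", "ART", 1620.0, 3.0)]
predictions = predicted_fgpa(records)
```

The resulting per-student predictions would then be correlated with actual FGPA, as described above.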

As shown in Figure 5, the correction for course difficulty increased the V + M + H 
correlation by about .06 (from .43 to .49) with an additional increase to .65 when also adjusted 
for range restriction. These corrections for grading differences are somewhat smaller than those 
found by Ramist, Lewis, and McCamley-Jenkins (1994), but are consistent with those computed 
by Stricker et al. (1994). Figure 5 also shows that the conservative uniform weight correlations 
were nearly as high as those with the optimal regression weights. 

For the course-grade residual analysis, we used the overall V + M + H prediction equation 
for a college to predict the FGPA for all of the students in a given course. The course residual 
was the difference between the predicted FGPA of the students in that course and the actual 
mean grade of the students in that course. Thus, each course had a residual value associated with 
it, with positive residuals indicating a course with higher grades than would be expected from the 
admissions scores of the students in that course, that is, a course with lenient grading; negative 
residuals indicated strict grading. For a given student, these residuals were averaged over all of 
the courses taken by that student. This mean residual was then used as an additional predictor 
(along with V, M, and H) in predicting the FGPA for a student. Results of this procedure were 
nearly identical to those for the predicted FGPA procedure (mean correlation over colleges of .50 
for mean grade-residual analysis compared to .49 for the predicted FGPA procedure). 
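
The residual bookkeeping in this procedure can be sketched as follows. This is a hedged illustration with hypothetical function names and invented records. It assumes each student's predicted FGPA from the college-wide equation is already available, and it stops short of the final step, in which the mean residual enters the regression as a fourth predictor.

```python
from collections import defaultdict


def course_residuals(records):
    """records: (student, course, predicted_fgpa, course_grade) tuples.
    Residual = mean actual grade in the course minus the mean predicted FGPA
    of its students; positive values suggest lenient grading."""
    totals = defaultdict(lambda: [0.0, 0.0, 0])
    for _, course, predicted, grade in records:
        entry = totals[course]
        entry[0] += grade
        entry[1] += predicted
        entry[2] += 1
    return {c: (g - p) / n for c, (g, p, n) in totals.items()}


def mean_student_residual(records, residuals):
    """Average the course residuals over the courses each student took."""
    per_student = defaultdict(list)
    for student, course, _, _ in records:
        per_student[student].append(residuals[course])
    return {s: sum(v) / len(v) for s, v in per_student.items()}


# Hypothetical records (not study data).
records = [
    ("a", "EASY", 3.0, 3.6), ("b", "EASY", 2.8, 3.4),
    ("a", "HARD", 3.0, 2.5), ("b", "HARD", 2.8, 2.3), ("c", "HARD", 3.2, 2.7),
]
residuals = course_residuals(records)
student_residuals = mean_student_residual(records, residuals)
```

A student whose schedule is dominated by strictly graded courses ends up with a negative mean residual, which lets the prediction equation lower that student's expected FGPA accordingly.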

The third procedure did not directly adjust for differences in course grading; it merely 
grouped students into more homogeneous categories based on their intended college majors as 
indicated by their responses to the Student Descriptive Questionnaire. This method has obvious 
drawbacks in that students frequently change their intended majors before or after enrolling in 
college, and even students with different majors can have a similar mix of courses during the 
freshman year. Nevertheless, this approach has the distinct advantage of not requiring colleges 
to supply any course-level information, so it could be used for all 23 of the colleges in the 
sample. We grouped all majors into two categories — “math/science” included majors in the 
physical and biological sciences, engineering, and mathematics; all other majors were put in the 
“other” category. As indicated in Figure 6, correlations were uniformly higher in the 
math/science group. Note that because these correlations were adjusted for range restriction the 
higher correlations in the math/science category cannot be attributed merely to greater variability 
of scores in that category. The means, standard deviations, and standardized differences (d) in 
Table 3 indicate that the difference in FGPA between the math/science and other groups is 
considerably smaller than the difference in any of the admissions measures, suggesting that 
grading standards were indeed more rigorous for students whose intended major was in a 
math/science field. 

Over/underprediction adjusted for course difficulty. Because members of different 
gender and ethnic subgroups may differentially sort themselves into courses with relatively strict 
or lenient grading standards, adjusting for course difficulty can also have an impact on the 
extent to which grades are over- or underpredicted. For each subgroup, we used the FGPA 
predicted in the course grade residual analysis (V + M + H + Residual) that accounts for course 
difficulty differences, and then found the difference between the predicted and actual FGPA. 

The correction reduced the underprediction of women’s grades from -0.07 to -0.05; for women 
in the two colleges in the highly selective category that provided course grades, underprediction 
was reduced from -0.04 to -0.03. 

For the full sample of colleges, not just those that supplied course grades, 
over/underprediction results by intended major are presented in Table 4. Grade predictions were 
made without regard to intended major, but differences between predicted and actual grades were 
computed within major (math/science or other). Results were consistent with the notion that 
grading standards are more strict in math/science fields. In every gender/ethnic category, grades 
for students with intended majors in math/science were not as high as predicted from the V + M 
+ H equation for the entire college. The generalization that grades of women are underpredicted 
was not true for these math/science students, though the overprediction for men was notably 
larger than the overprediction for women. 

The results of all of the course-adjustment procedures underscore the importance of 
taking grading differences into account whenever possible for predictions of FGPA. When 
adjustments cannot be made, it should at least be acknowledged that the resulting correlations are 
underestimates of the ability of the admissions measures to predict college grades. 

Conclusions 

The SAT I appears to predict about equally well across ethnic groups. At most colleges, 
grades of females are more predictable than grades of males, but at the most highly selective 
colleges, the grades of males and females are predicted equally well. Males generally perform 
slightly worse in their freshman year than predicted from test scores and high school grades; 
women perform slightly better than predicted. Within African American and Hispanic/Latino 
groups, men perform worse than predicted and women perform about as predicted. 

Across ethnic groups, grades are more predictable for higher SES students than for lower 
SES students. This is true for both parent education and income definitions of SES. 

Accounting for differences in course grading practices produces a noticeable 
improvement in predictions of FGPA. Even as simple a procedure as running correlations 
separately for students who indicate that they would like to be math/science majors has an 
impact on the size of validity coefficients. Validity coefficients would have been even larger had 
we adjusted for unreliability in the FGPA criterion as was done by Ramist et al. (1994). 

However, we had no way of adequately estimating the reliability of the FGPA for all of the 
schools in our sample. If grade reliability were about the same in our sample as in the Ramist et 
al. sample (a reasonable but unverifiable assumption), about .05 should be added to the adjusted 
correlations. Thus, for example, correcting for the unreliability of the FGPA would raise the 
correlation for the V+M+H composite in math/science students from .66 to .71. 

Many issues remain to be explored in future analyses of the data base created for this 
study. Additional data will be needed to explore longer-term validity issues. 

Table 3 

Means and Standard Deviations of Scores and GPAs by Intended Major 

                        Intended Major 
              Math/Science            Other 
Measure        M        SD         M        SD         d 
SAT I-V       578       84        560       84        0.21 
SAT I-M       628       76        564       83        0.81 
HSGPA         3.70     0.43       3.54     0.46       0.36 
FGPA          3.01     0.79       2.96     0.74       0.07 

Note. The standardized difference between the Math/Science and Other 
categories, d, is the mean difference divided by the square root of the 
unweighted average of the squared standard deviations. 
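
The definition of d in the note can be checked directly against the table. The sketch below uses a hypothetical function name and the rounded means and SDs printed in Table 3, so the results agree with the reported d values only to within rounding (about .01).

```python
from math import sqrt


def standardized_difference(mean_a, sd_a, mean_b, sd_b):
    """Mean difference divided by the root of the unweighted mean of squared SDs."""
    return (mean_a - mean_b) / sqrt((sd_a ** 2 + sd_b ** 2) / 2)


# (Math/Science mean, SD, Other mean, SD) as printed in Table 3.
table3 = {
    "SAT I-V": (578, 84, 560, 84),
    "SAT I-M": (628, 76, 564, 83),
    "HSGPA": (3.70, 0.43, 3.54, 0.46),
    "FGPA": (3.01, 0.79, 2.96, 0.74),
}
d_values = {name: standardized_difference(*row) for name, row in table3.items()}
```

The FGPA gap is by far the smallest of the four, which is the basis for the inference that grading was stricter for intended math/science majors.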



Table 4 

Over (+) and Underprediction (-) of FGPA for Students with Math/Science or Other Intended Majors 

[Table body not recoverable from the source. Surviving header row: Gender, African, Asian, Hispanic/ ...] 

[Figure 1 graphic not reproduced. Axes: Adjusted Correlation by Mean Score Range (SAT I [V+M]); legend: Males, Females.] 

Figure 1. Adjusted correlations of SAT I (V+M) scores with FGPA for Males and 
Females in Colleges in Three Score Ranges. 

Figure 2. Over (+) and Underprediction (-) of FGPA from SAT I (V+M) and 
HSGPA Composite for Four Ethnic Groups. 

Figure 3. Adjusted Correlations of the V+M+H Composite with FGPA by Parental Education 
Level for Males and Females in Four Ethnic Groups. 

Figure 4. Adjusted Correlations of the V+M+H Composite with FGPA by Family Income Level 
for Males and Females in Four Ethnic Groups. 

Figure 5. Effects of Correction for Course Difficulty and Range Restriction on Correlation of 
FGPA with V+M+H Composite (Optimal Weights and Uniform Weights). 

Figure 6. Adjusted Correlations of the V+M+H Composite with FGPA by Intended Major for 
Males and Females in Four Ethnic Groups. 